PLS-Optimal: A Stepwise D-Optimal Design Based on Latent Variables
Abstract
Several applications, such as risk assessment within REACH or drug discovery, require reliable methods for the design of experiments and efficient testing strategies. Keeping the number of experiments as low as possible is important from both a financial and an ethical point of view, as exhaustive testing of compounds requires significant financial resources and animal lives. With a large initial set of compounds, experimental design techniques can be used to select a representative subset for testing. Once measured, these compounds can be used to develop quantitative structure-activity relationship models to predict properties of the remaining compounds. This reduces the required resources and time. D-Optimal design is frequently used to select an optimal set of compounds by analyzing data variance. We developed a new sequential approach to apply a D-Optimal design to latent variables derived from a partial least squares (PLS) model instead of principal components. The stepwise procedure selects a new set of molecules to be measured after each previous measurement cycle. We show that application of the D-Optimal selection generates models with a significantly improved performance on four different data sets with end points relevant for REACH. Compared to those derived from principal components, PLS models derived from the selection on latent variables had a lower root-mean-square error and a higher Q² and R². This improvement is statistically significant, especially for the small number of compounds selected.
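The stepwise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`pls1_scores`, `d_optimal_pick`, `measure`), the NIPALS PLS1 fit, the greedy rank-one selection rule, the small `ridge` term, the omission of an intercept column, and the synthetic data are all assumptions made for illustration. In each cycle, a PLS model is fitted on the compounds measured so far, every compound is projected onto the latent variables, and the unmeasured compounds that most increase the determinant of the score information matrix are picked for the next round.

```python
import random

def pls1_scores(X_train, y_train, X_all, n_components=2):
    """Fit PLS1 (NIPALS) on measured compounds; return latent scores for all compounds."""
    n, p = len(X_train), len(X_train[0])
    x_mean = [sum(r[j] for r in X_train) / n for j in range(p)]
    y_mean = sum(y_train) / n
    Xr = [[r[j] - x_mean[j] for j in range(p)] for r in X_train]  # centred training X
    yr = [v - y_mean for v in y_train]                            # centred training y
    Xa = [[r[j] - x_mean[j] for j in range(p)] for r in X_all]    # all compounds, same centring
    scores = [[] for _ in X_all]
    for _ in range(n_components):
        w = [sum(Xr[i][j] * yr[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # no covariance with y left to extract
            break
        w = [v / norm for v in w]
        t = [sum(Xr[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(Xr[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yr[i] * t[i] for i in range(n)) / tt
        for i in range(n):  # deflate the training data
            for j in range(p):
                Xr[i][j] -= t[i] * load[j]
            yr[i] -= q * t[i]
        for i, row in enumerate(Xa):  # project and deflate every compound
            ta = sum(row[j] * w[j] for j in range(p))
            scores[i].append(ta)
            for j in range(p):
                row[j] -= ta * load[j]
    return scores

def d_optimal_pick(scores, measured, batch_size, ridge=1e-6):
    """Greedily add the unmeasured rows that most increase det(ridge*I + T'T)."""
    k = len(scores[0])
    # Inverse of the information matrix, updated with the Sherman-Morrison formula.
    m_inv = [[1.0 / ridge if a == b else 0.0 for b in range(k)] for a in range(k)]

    def gain(t):  # det grows by the factor 1 + t' M^-1 t when row t is added
        return sum(t[a] * m_inv[a][b] * t[b] for a in range(k) for b in range(k))

    def add(t):
        u = [sum(m_inv[a][b] * t[b] for b in range(k)) for a in range(k)]
        g = 1.0 + sum(t[a] * u[a] for a in range(k))
        for a in range(k):
            for b in range(k):
                m_inv[a][b] -= u[a] * u[b] / g

    for i in measured:
        add(scores[i])
    pool = set(range(len(scores))) - set(measured)
    chosen = []
    for _ in range(min(batch_size, len(pool))):
        best = max(sorted(pool), key=lambda i: gain(scores[i]))
        chosen.append(best)
        pool.remove(best)
        add(scores[best])
    return chosen

# Synthetic demonstration: 40 hypothetical compounds with 4 descriptors each.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(40)]

def measure(row):  # hypothetical stand-in for a laboratory measurement
    return 2.0 * row[0] - row[1] + random.gauss(0, 0.1)

measured = list(range(5))  # assumed initial set, chosen arbitrarily here
y = {i: measure(X[i]) for i in measured}
for cycle in range(3):     # three measurement cycles of three compounds each
    T = pls1_scores([X[i] for i in measured], [y[i] for i in measured], X, 2)
    batch = d_optimal_pick(T, measured, 3)
    for i in batch:
        y[i] = measure(X[i])
    measured.extend(batch)
```

Replacing the PLS scores with principal component scores in the same loop would give the baseline that the abstract compares against. The greedy rule is a common heuristic for D-optimal selection; exchange algorithms are an alternative and may give different subsets.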
Related resources
A nonlinear PLS path modeling based on monotonic B-spline transformations
PLS path modeling is widely used in marketing applications. It is based on linear equations. However, in practical applications, many relations cannot be regarded as linear. For example, the relations between satisfaction and its attributes are nonlinear (Mittal et al., 1998). In this paper, we present a two-step approach in order to include nonlinear relationships between manifest...
Latent variable transformation using monotonic B-splines in PLS Path Modeling
PLS Path Modeling is a widely used approach in marketing. In that field, relationships between latent variables are frequently nonlinear. This nonlinearity is usually defined by a piecewise linear function. In this talk, we present an approach to include nonlinear transformations of the latent variables in the PLS path modeling algorithm with monotonic B-splines. We use optimal scaling methods...
Multi-loop Internal Model Controller Design Based on a Dynamic PLS Framework
In this paper, a multi-loop internal model control (IMC) scheme in conjunction with feed-forward strategy based on the dynamic partial least squares (DyPLS) framework is proposed. Unlike the traditional methods to decouple multi-input multi-output (MIMO) systems, the DyPLS framework automatically decomposes the MIMO process into a multi-loop system in the PLS subspace in the modeling stage. The...
Designation of a Palm-Free Frying Oil Formulation Based on Sunflower, Canola, Corn and Sesame Oils Using D-Optimal Mixture Design
Background and Objectives: Oils used in frying should include special characteristics such as high oxidative stability, prolonged shelf life, low price, abundance and availability and desirable flavors. Nowadays, consumers are further interested in low saturated frying oils. Recently, manufacturers focus on eliminating palm oil derivatives (as a major vegetable source of saturation) from fryin...
Feature Selection using Eigenvalue Optimization and Partial Least Squares
Feature selection is an essential problem in many fields such as computer vision. In this paper we introduce a supervised feature selection criterion based on Partial Least Squares regression (PLS). We find an optimal feature subset by applying the theory of Optimal Experiment Design to optimize the eigenvalues of the loadings matrix obtained from PLS. Since PLS extracts components such that th...
Journal:
- Journal of Chemical Information and Modeling
Volume 52, Issue 4
Publication date: 2012